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Multicanonical molecular dynamics (MD) is a powerful technique for sampling conformations on 
rugged potential surfaces such as protein. However, it is notoriously difficult to estimate the multi- 
canonical temperature effectively. Wang and Landau developed a convenient method for estimating 
the density of states based on a multicanonical Monte Carlo method. In their method, the density of 
states is calculated autonomously during a simulation. In this paper, we develop a set of techniques 
to effectively apply the Wang-Landau method to MD simulations. In the multicanonical MD, the 
estimation of the derivative of the density of states is critical. In order to estimate it accurately, we 
devise two original improvements. First, the correction for the density of states is made smooth by 
using the Gaussian distribution obtained by a short canonical simulation. Second, an approximation 
is applied to the derivative, which is based on the Gaussian distribution and the multiple weighted 
histogram technique. A test of this method was performed with small polypeptides, Met-enkephalin 
and Trp-cage, and it is demonstrated that Wang-Landau MD is consistent with replica exchange 
MD but can sample much larger conformational space. 

PACS numbers: 02.70.Ns, 87.15.-v 



I. INTRODUCTION 

Computer simulation has been established as a tech- 
nique for studying the systems with many degrees of free- 
dom such as spin glasses and proteins. The difficulties in 
studying such systems stem from the fact that the con- 
formational space is very large and there exist a huge 
number of local minimum energy states. For studying 
interesting phenomena or calculating quantities such as 
the transition of states or the free energy of the system, a 
global sampling of the conformational space and uniform 
sampling on energy space is desired. Canonical simula- 
tions at high temperatures realize the global sampling 
but low energy states are poorly sampled. At low tem- 
peratures, simulations can sample low energy states but 
the global search is difficult. In order to overcome these 
difficulties, many generalized-ensemble algorithms have 
been developed such as the multicanonical algorithm [lj 
and the replica exchange method (REM) 0, Q . 

The multicanonical method is based on artificial weight 
factor by which a simulation realizes a flat energy his- 
togram. This weight factor is not known a priori and 
has to be determined by iterating simulations and esti- 
mations. If the value of the weight factor is partially too 
large or small, then it causes to disturb a random walk in 
energy space. Therefore, the multicanonical weight fac- 
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tor needs to be fairly accurate. However, obtaining an ac- 
curate weight factor is often difficult and requires a great 
deal of expertise. In REM, a number of non-interacting 
replicas of the original system at different temperatures 
are simulated independently and simultaneously. Every 
few steps the temperatures of pairs of replicas are ex- 
changed, subject to a detailed balance condition. As the 
system size increases, however, the required number of 
replicas also greatly increases. In such a huge simulation 
system, it takes an unrealistically long CPU time for a 
replica to traverse the entire temperature range. 

Wang and Landau [4] proposed a simulation method to 
sample conformational space efficiently. In their method 
(hereafter referred to as the WL method) , the weight fac- 
tor is automatically estimated while the multicanonical 
simulation is performed. This method is originally based 
on a Monte Carlo (MC) algorithm. 

In this paper, we focus on efficient sampling over the 
conformational space of an all-atom model of protein. 
Molecular dynamics (MD) and MC are the standard tech- 
niques for all atom protein simulations. It is recognized 
that the sampling efficiency of the MC method is com- 
parable, or sometimes superior to MD for liquid simula- 
tions. In protein simulations, however, MD shows about 
1.5 times better sampling efficiency [5]. This difference 
is attributed to the inertia force term in MD, which does 
not exist in MC. 

Recently, MD with the WL method was applied to 
simulations of solutions @ in which the original WL al- 
gorithm was applied to MD simulations in a straightfor- 
ward manner. As we mention in detail later, the simple 
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application of the WL method to MD does not work well 
for a protein system for a number of reasons. First, in 
the multicanonical MD or related methods such as the 
WL MD, not only the density of states but its derivative 
are required. In the WL method, the density of states is 
rugged, so that its derivative is of poor accuracy. Second, 
as the WL method realizes quick random walk in energy 
space then for a polymer such as a protein, temperature 
changes occur in a time scale shorter than is necessary for 
a change of the structure of the whole molecule. Even- 
tually it takes a long time for a meaningful structural 
change to occur. In this paper, we present techniques for 
implementing an efficient WL MD method to circumvent 
these problems. 

This paper is organized as follows. We describe a de- 
tailed formulation of the Wang-Landau molecular dy- 
namics (WLMD) method in Sec. |TTJ In Sec. IIII1 we 
introduce the model system for the present study, the 
Met-enkephalin, a 5-residue peptide, and the Trp-cage, 
a 20-residue protein. Next, we compare the sampling ef- 
ficiency of the current method with that of the replica 
exchange MD (REMD) method in Sec. [TV] 



II. METHOD 

A. The Wang-Landau method 

Wang and Landau have developed a very powerful MC 
simulation technique for efficiently sampling conforma- 
tional space. In their method, the transition probability 
from energy E\ to E 2 is given by 



p(E 1 — > E 2 ) = min 



n(E 1 
n(E 2 



1 



(1) 



where n{E) is the density of states. This is the same as 
in the standard multicanonical MC method. Since n(E) 
is not known a priori, one has to determine it by some 
means. In conventional multicanonical simulations, it is 
determined by iterating short preliminary simulations us- 
ing 

lnn (,;+1) (£;) = \nn {i) {E) + In H (l) (E) V£ G £, (2) 

where n^(E) and H^'{E) are the density of states and 
the histogram in the ith simulation, respectively, and £ is 
a set of allowed energy levels. What is unique of the WL 
method resides in the scheme for updating n(E). When 
an energy level, Ei, is visited at the time step, i, the 
existing n{E) is modified by a modification factor 7 > 1, 
i. e., 



(E) 



~/nW(E), if E = Ei, 
(E), otherwise. 



(3) 



For a given energy, E, if rS l > (E) is smaller than the true 
density of state, n(E), the energy state E is sampled in- 
tensively so that rv- 1 ' {E) is updated to approach the true 



value, n(E). On the contrary, if n^(E) > n(E), the 
sampling of the energy state, E, is suppressed and other 
energy states are extensively sampled. Although n(E) is 
unknown at the very beginning of simulation, n^(E) ap- 
proaches the true value quickly and automatically. Since 
nS l '{E) is updated every step, it is possible for the system 
to escape local minima in a very short period of time. 



B. The Wang-Landau molecular dynamics 

We apply the WL method to MD simulations. The 
equation of motion for the multicanonical MD is given 

by 0,1 



dvk 
~df 



Pn 



Pmu dE 

fa dx k 
dhxn(E) 
dE ' 



(4) 
(5) 



where rrifc, Vk, and %k represent mass, velocity, and 
coordinate of fcth atom, and the inverse temperature, 
/3q = 1/fceTo (fee is the Boltzmann constant), with the 
simulation temperature, To. /3 mu is referred to as the 
multicanonical temperature. In a straightforward appli- 
cation of the WL method to MD, the density of states, 
n(E), could be estimated according to Eq. ([3]). However, 
such a simple method does not work well in practice. It is 
necessary to accurately estimate not only n(E) but also 
its derivative /3 mu . In numerical calculation, n(E) is di- 
vided into bins, and each bin is given a discrete value. In 
general, the estimate of /3 mu is of low accuracy due to the 
ruggedness between neighboring bins. In order to smooth 
this ruggedness, we approximate canonical energy distri- 
bution at various temperatures by the Gaussian distri- 
bution and combine them with the weighted histogram 
method (WHAM) [g[ for updating n^(E) and estimat- 
ing the multicanonical temperature. 

In the canonical ensemble at an inverse temperature, 
P, the distribution of energy E is written as 



S-0E 



Pp(E)=n{E)e 



where n{E) is the density of states and 



-' = 5>(i5) e - 

E 



■0E 



(6) 



(7) 



The last equation [Eq. defines the partition function. 
The canonical distribution usually has a bell-like shape. 
At the energy -E pea k corresponding to the peak of the 
distribution, the equation d In Pp (E) /dE = is satisfied. 
This is rewritten as 



d\nn{E) 



dE 



p = o. 



(8) 



E=E„ 



By comparing Eqs. |5]) and JHJ), one can see that the 
multicanonical temperature /3 mu at energy E is the tem- 
perature of a canonical ensemble whose peak energy is E. 
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If -Speak and n(E) are known, it is possible to solve Eq. 
([5]) for (3. However, it is difficult to evaluate -E pea k since 
dlnn(E) / dE is inaccurate if n(E) is estimated from a 
simulation (which is the case in practice). On the other 
hand, it is relatively easy to estimate the thermal average 
{E)p as 



(E) = 



J2 E Enil) (E)e- 0E 



■0E 



(9) 



where nW(B) is the density of states estimated from the 
simulation. The variance of the energy can be estimated 
in the same manner, 



E E (E-(E) ) 2 n^(E)e-^ 
J2 E nV)(E)e-0 E 



(10) 



Hansmann [T(| suggested using (E) ^ instead of i? pca k- 
In his prescription, /3 mu is determined from the table of 
(E)g. The canonical distribution of the systems that we 
are interested in is similar to the Gaussian distribution. 
Using (E)g and the deviation a 2 at /3, we assume that 
the following approximation holds: 



P/s(E) = 



ex P 1 2ST^ 



22e> ex P ( 2^T^ 



(11) 



which can be rearranged, using Eqs. ^ and (HJ), to 



Ej5' ex P (" 



It is possible to calculate the right-hand side from 
a canonical simulation, but the accuracy decrease as 



\^E — {E)pj increases. In order to keep the accuracy 

for a wide energy range, n(E) is estimated by using the 
WHAM technique with a number of temperatures, 



n(E) 



where 



5>(e) ( 



-0iE 



(13) 



(14) 



and Wj's are appropriate weight factors (see below). n(E) 
and fj can be obtained by solving Eqs. (fTBf and ([M]) 
iteratively. From Eqs. (fTTj) and (fl~3|) . it follows that 



dlnn(E) E^^'-i'A^ 



(15) 



which can be further reduced, by using Eq. ([6]), to 



= (3{E). 



(16) 

Both n(-E) and /j vanish in this expression. We use this 
equation for estimating the multicanonical temperatures. 
Note that Eq. (fTI]) is not exact. This implies that our 
method may not be suitable for a system with doubly 
peaked distribution, for example. Nevertheless, such an 
error should be small if we use a sufficiently large number 
of temperatures. 

In a WLMD simulation, we need to specify the lowest 
and highest temperatures for the system (/3 m i n and /3 m ax, 
respectively), and prepare energy bins and temperatures 
corresponding to each energy bin. The initial estimate of 
the density of states, rv- x '(E), is obtained as follows. We 
first run a short (say, 100 steps) canonical MD simulation 
at the inverse temperature /3 max (corresponding to the 
highest temperature), and calculate the average and 
variance of the energy from this short simulation. We 
then apply Eqs. (fT!?)) and ([7]) iteratively which yields the 
initial estimate n^(E). The rest of a WLMD proceeds 
as follows. 



(1) Set % = 1 and /3£i = /3 max . 

(2) A canonical MD is performed for a very short 
period of time (say, 100 steps) according to Eq. ([5]) 

(i) 

with fixed /3mu- During this period, the time average 
Ep(i) and the variance er ( 4) of the energy are estimated. 

The Gaussian energy distribution based on these values 
is denoted (E). Let the energy value of the last 

step of this canonical simulation be Ej£L. Let = 



l a st ' 

maxfc=i,...,i^„(»,). 



min*=i,...,<-E , 0(io and E^ 1 

(3) The average and variance of energy at the inverse 
temperature /3 max are estimated using all the energy val- 
ues that have been visited with /3 mu = /3 max up to this 
time step. Using these values, the Gaussian distribution 
Pf 1 (E) is obtained. Similarly, the Gaussian distribu- 
tion at m i a , P^ in (E), is estimated. Pg^(E) is set to 
zero if the simulation has never visited this temperature. 

(4) For each energy bin, Ej, calculate the correspond- 
ing temperature fn' = (3^{Ej) according to Eq. |T 
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with the Gaussians P (l -i } (E), pjf* (E), P? J {E), and 

P plu( E )- The weight Wj is set to 1 except for PfljE) 
for which the weight is set to a specified modification 
factor w mu > [this factor corresponds to 7 in Eq. ([3])]. 
(Note that, for i = 1, Pm-x) (E) is not yet defined and is 

set to zero.) 

(5) Using the temperatures /3j obtained in the pre- 
vious step, calculate, from Eqs. © and (fTU|) as well as 
the current estimate of the density of states n^(E), the 
average and variance, (E) „(i) and a w , and the Gaussian 

™3 Pi 



4 



distribution, PpW (E) 



(6) Iterate steps 4 and 5 until all the (3j s converge. 

(7) Update the density of states using Eqs. IT3]) 
and (I14|) and the Gaussians to obtain the next estimate 
n^+V(E). 

(8) Estimate the next multicanonical temperature 



i+i) 



according to Eq. dTHJ) at the energy of the last 



canonical MD step, E ] 



last ' 



If < E%£>* or > 



E mu x '\ is evaluated at E™™> 1 or E^ K l , respec- 

tively. This is necessary for a stable simulation. 
(9) Set i = i + 1 and go back to step 2. 



with 



dlnn(E h ,E n ) 

dE h 
8\nn(E h ,E n ) 

dE n 



(20) 
(21) 



Then it is possible to give different temperatures for bond 
and non-bond interactions. In order to suppress the mo- 
tion due to the bond interaction and to facilitate global 
conformational changes, /% is fixed whereas /3 n is var- 
ied according to the WLMD scheme from a low to high 
temperature. 



In order to avoid numerical overflow in the calculation 
of (E) in the step 7, we limit the summation in Eq. 

(fT5)l to those inverse temperatures (3^ satisfying 

iCT < (E) w < (17) 

Recently, Kim et al. [ll| developed the method to de- 
termine the multicanonical temperature automatically. 
The notable points in their method are the short pe- 
riod update and adding the derivative of histogram to 
the multicanonical temperature. While their method is 
different from our method in the way to update the his- 
togram, the idea of the short term update is common. 
They use the finite difference of raw data (energy his- 
togram) for update. Since the present method uses the 
Gaussian mask and the WHAM technique to estimate 
the density of states and its derivative, it is expected to 
be more robust. 



C. Separation of bond and non-bond interaction 

When we first applied the above method to a protein 
system, a random walk in the energy space was read- 
ily realized. However, the protein conformation changed 
very little in the whole process. It was found that the 
energy change predominantly originated from the bond 
length and the bond angle deviations. As a result, the 
bond energy almost solely contributed to n(E). To avoid 
this artifact, we separated the bond interaction from the 
non-bond interaction, and applied the multicanonical en- 
semble only to the non-bond interaction. The energies of 
bond and non-bond interactions are written as E^ and 
E n , respectively, i. e., 

E = E h + E n . (18) 



III. MODEL FOR NUMERICAL CALCULATION 

We carried out simulations using a customized version 
of the PRESTO molecular simulation package [13], in 
which the all-atom version of the AMBER force-field pa- 
rameters (C96) [l3[ was used. 

A. Met-enkephalin 

We used the Met-enkephalin [Protein Data Bank 
(PDB) code: 1PLW; amino acid sequence: YGGFM] 
as a model system for checking the correctness of the 
method. Eight independent WLMD simulations were 
conducted for 10 ns (80 ns in total), starting from the 
extended conformation with the temperature ranging be- 
tween 200 and 700 K. No solvent effect was included but a 
distance-dependent dielectric constant was used (e = 4r, 
where r is the inter-atomic distance). The density of 
states was updated every 1000 steps (unit time step was 
0.5 fs). The objective of these simulations is to check 
whether the present method can yield a plausible density 
of states compared to the REMD simulations. Therefore, 
the modification factor for the density of states was ini- 
tially set to wp nlu — 12.8 which was decreased by the 
factor of 0.8 every 1 ns. This procedure is analogous to 
that proposed originally by Wang and Landau [1]. 

For comparison, we carried out an extensive REMD 
simulation with eight replicas, each running 100 ns. Tem- 
peratures ranged between 200 and 700 K, and were dis- 
tributed exponentially. Replica-exchanges were tried ev- 
ery 20 steps. The average exchange acceptance ratio was 
approximately 18.6%. For the calculation of principal 
components and density of states, those conformations 
generated during the first 1 ns of the simulation were 
discarded. 



Using Ey, and E n , density of state is represented as 
n(i?b, -En)- The equations of motion with the n(£'b, E n ) 
is now expressed as 



mk 



dvk _ _(3bdEb _ Pn dE n 
dt (3 dxk (3 dxk 



(19) 



B. Trp-cage 

The protein used for demonstrating the efficiency of 
the method is a 20-residue protein, Trp-cage (PDB code: 
1L2Y; amino acid sequence: NLYIQ WLKDG GPSSG 



5 



RPPPG), which was derived from the C-terminal frag- 
ments of a 39-residue exendin-4 peptide [lij]. Physic- 
ochemical studies showed that Trp-cage folds sponta- 
neously and cooperatively. It contains a short a-helix 
in residues 2-9, a 3io-helix in residues 11-14, and a C- 
tcrminal polyprolinc II helix to pack against the central 
tryptophan (Trp-6). Several simulation studies on this 
Trp-cage have already been published [HI, [l|| [13, HI] • 

An implicit solvent model [l9j was used with the di- 
electric constant was set equal to 4r, where r is distance 
between point charges. Furthermore, we employed an 
empirical dihedral angle potential which was developed 
in our laboratory [2(| • The simulations were started from 
an extended conformation. The unit time step was set 
to 0.5 fs, and we ran MD simulations for 10 ns. We set 
T min = 1/fcnAnin = 200 K, T max - l/k B (3 maK = 700 
K, Tb = 300 K, the multicanonical temperature was up- 
dated every 400 steps (0.2 ps). The number of energy 
bins was set to 256. We carried out 24 runs with differ- 
ent random initial velocities. The modification factor for 
the density of states is set such that W/3 mu = 12.8. 

For comparison, an alternative simulation with REMD 
was performed under the equivalent condition, that is, 24 
replicas each running for 10 ns. Temperatures exponen- 
tially varied from 200 to 700 K and swapping was carried 
out every 400 steps. The average acceptance ratio was 
33.1%. 

The run time required for one WLMD simulation was 
184-193 h (189 h on average) on an Intel Pentium III 
(1GHz) processor. For REMD, the run time per CPU was 
207 h. WLMD seems slightly more efficient than REMD 
in spite of the extra effort required for the WHAM iter- 
ations. The lower computational efficiency of REMD is 
caused by the synchronization process which is necessary 
for different replicas to exchange. 



IV. RESULTS AND DISCUSSION 

In order to check the correctness of WLMD, we first 
compare the results of WLMD and REMD for a small 
peptide Met-enkephalin. Figure Q] shows that the con- 
formational spaces sampled by the two methods closely 
overlap, indicating that WLMD can sample the confor- 
mational space as widely as REMD. However, it should 
be noted that 10 times more points are plotted for REMD 
than WLMD: if we plot the same number of points, the 
REMD result looks much more sparse. 

For a WLMD simulation to work at all, good estimate 
of the density of states should be obtained as the sim- 
ulation proceeds. We compare the density of states ob- 
tained by eight WLMD simulations with that obtained 
by a long REMD simulation in Fig. O Again, we see that 
the results of the two methods agree quite well. A close 
examination reveals that WLMD slightly overestimates 
the density of low energy (< 70 kcal/mol) states (Fig. [21 
lower part) compared to REMD. Nevertheless, such dif- 
ference is of a small fraction (sa 3%) of the entire range of 
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FIG. 1: Conformations of Met-enkephalin sampled by Wang- 
Landau molecular dynamics (WLMD, upper panel) and 
replica-exchange molecular dynamics (REMD, lower panel) 
projected onto the first two principal component axes (PCA) . 
During WLMD 160 000 conformations were saved and 10 000 
lowest energy conformations were selected. In the same man- 
ner, 10 000 lowest energy conformations were selected for 
REMD. These 20 000 conformations in total were used for 
determining the principal axes. Upper panel: 10 000 low- 
est energy conformations obtained by WLMD are projected. 
Lower panel: 100 000 lowest energy conformations obtained 
by REMD are projected. Each pixel is colored according to 
the cRMS from the PDB structure. 

the density of states. Since WLMD is simply a method to 
efficiently estimate the density of states for multicanoni- 
cal MD, we can always further refine the density of states 
using an ordinary multicanonical MD procedure. 

We next turn to our main application of WLMD: sam- 
pling of low-energy conformations of Trp-cage. In order 
to check if the density of states is still accurately esti- 
mated for this bigger protein, we plotted the density of 
states obtained by three independent WLMD simulations 
(Fig. [3]). Although we cannot know the exact density of 
states, the consistency among these estimates of density 
of states suggests that the present scheme is at least self- 
consistent. In order to calculate thermodynamic quan- 
tities, one should gradually decrease the value of w mn 
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FIG. 2: Comparison of density of states of Met-enkephalin 
obtained by WLMD (blue lines) and REMD (red line). Eight 
density of states were obtained from eight WLMD simulations 
(blue lines), one density of states was calculated from one 
REMD simulation with eight replicas (red line). The lower 
panel shows the difference of each density of states of WLMD 
from that of REMD. 
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FIG. 3: Density of states of Trp-cage obtained from three 
independent Wang-Landau molecular dynamics simulations. 
Each line represents the density of states obtained from one 
WLMD run. 



toward zero and set w m u = for the production run, as 
suggested by Wang and Landau In the present ex- 
ample with Trp-cage, our purpose is efficient sampling 
of low energy conformations but not calculating thermo- 
dynamic quantities so that w mu is always kept greater 
than 0. We next compare the (non-bond) energy trajec- 
tories of WLMD and REMD simulations (Fig. gj). We 
selected three trajectories arbitrarily from 24 simulations 
or replicas. It is immediately apparent that the WLMD 
trajectories (Fig. |H upper panel) traverse a wide range of 
energy very rapidly compared to the REMD trajectories 
(Fig. 0] lower panel). In fact, one trajectory of REMD 
seems to be trapped in a local minimum. 

A similar trend is also observed for temperature trajec- 
tories (Fig. O. The multicanonical temperature changes 
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FIG. 4: Time series of non-bond energy of Trp-cage. Upper 
panel: Wang-Landau molecular dynamics; the trajectories for 
the last 1 ns are enlarged for clarity. Lower panel: Replica 
exchange molecular dynamics. Arbitrarily selected three tra- 
jectories out of 24 are shown. 



very rapidly but smoothly in WLMD (Fig. [5j upper 
panel) while the temperature changes rather slowly in 
REMD (Fig. [5l lower panel). One reason for such a 
behavior in WLMD is that the multicanonical temper- 
ature can vary continuously between T m ; n (200 K) and 
Tmax (700 K) as it is determined by Eq. (JTSJ). On the 
other hand, in REMD, there are only 24 allowed and fixed 
temperatures if we use 24 replicas so that the acceptance 
ratio of temperature swapping can be arbitrarily small. 

Although rapid traversal of a wide energy or temper- 
ature range is necessary, it is not sufficient for efficient 
sampling of conformational space. In fact, rapid energy 
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FIG. 5: Time series of temperatures of Trp-cage. Upper 
panel: Three trajectories from Wang-Landau molecular dy- 
namics simulations; the trajectories for the last 1 ns are en- 
larged for clarity. Lower panel: Trajectories of three replicas 
from replica exchange molecular dynamics simulations. In 
both figures, each trajectory corresponds to that of Fig. 0] 



change may be associated with very localized motions 
such as bond stretching. In order to examine the sam- 
pling efficiency of WLMD, we first checked the trajecto- 
ries of the coordinate root mean square deviation (cRMS) 
from the native structure of Trp-cage (Fig. [5]). As ex- 
pected, cRMS changes more slowly (Fig. [5]) than might 
be suggested by the energy trajectories (Fig. [3]) in both 
WLMD and REMD. Nevertheless, WLMD is much less 
prone to being trapped in local minima and conforma- 
tions appear to keep moving compared to REMD. A sim- 
ilar trend was observed for the trajectories of end-to-end 
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FIG. 6: Trajectories of coordinate root mean square devia- 
tions (cRMS) from the native structure of Trp-cage. Upper 
panel: WLMD; lower panel: REMD. Each trajectory corre- 
sponds to that of Fig. |4] 



distance (not shown). 

In order to visualize the efficiency of conformational 
sampling, we carried out principal component analy- 
sis [21j . Out of 1 200 000 conformations saved during 24 
WLMD simulations, 50 000 lowest energy conformations 
were extracted, to which 50 000 lowest energy conforma- 
tions out of 1 200 000 saved during the REMD simulation 
(24 replicas) were added. These 100000 conformations 
were used to define the principal axes. Conformations 
sampled in WLMD and REMD were then projected onto 
the principal space (Fig. [7]). It is apparent from Fig. 
[7] that WLMD covers much more space than REMD. 
To confirm this observation more quantitatively, we di- 
vided the space spanned by the first three principal axes 
(each from -25 to 25 A) into small cells (1.0 x 1.0 x 1.0 
A 3 ), and counted the number of occupied cells. Out of 
125 000 cells, 4 335 (including the cell containing the na- 
tive structure) were occupied by WLMD while only 1 174 
by REMD and the native structure cell was not occupied. 
Therefore, in terms of the cell occupancy, WLMD is 3.7 
times more efficient than REMD for sampling protein 
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pal space spanned by the first and second principal axes. Up- 
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according to the cRMS deviation from the native structure of 
Trp-cage. 



conformations. 



In summary, we formulated the Wang-Landau molec- 
ular dynamics method. In so doing, we adopted the 
Gaussian masking for updating the density of states, 
and developed a technique to reliably estimate the mul- 
ticanonical temperature. It was shown that the WLMD 
method indeed samples conformational space of a protein 
more efficiently than the replica exchange MD method. 
Apart from the inaccuracy in the molecular force field, 
the present method will serve as a useful tool for simula- 
tion studies of protein molecules. 
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